QUANTIZATION FOR UNIFORM DISTRIBUTIONS ON EQUILATERAL TRIANGLES

CARL P. DETTMANN AND MRINAL KANTI ROYCHOWDHURY 


Abstract. We approximate the uniform measure on an equilateral triangle by a measure supported
on $n$ points. We find the optimal sets of points ($n$-means) and the corresponding approximation
(quantization) error for $n \le 4$, give numerical optimization results for $n \le 21$, and a
bound on the quantization error for $n \to \infty$. The equilateral triangle has particularly efficient
quantizations due to its connection with the triangular lattice. Our methods can be applied to
the uniform distributions on general sets with piecewise smooth boundaries.


1. Introduction 


The representation of a given quantity with less information is often referred to as 'quantization',
and it is an important subject in information theory. It has broad applications in signal
processing, telecommunications, data compression, image processing and cluster analysis. We
refer to [GG, GN, Z] for background on the subject; see also [GKL]. A rigorous mathematical
treatment of quantization theory is given in Graf-Luschgy's book (see [GL1]).


Let $P$ denote a Borel probability measure on $\mathbb{R}^d$ and let $\|\cdot\|$ denote the Euclidean norm on
$\mathbb{R}^d$ for any $d \ge 1$. We consider an approximation of $P$ by a measure supported on at most a
finite number of points, $n$. The $n$th quantization error for $P$ is defined by

\[ V_n := V_n(P) = \inf\Big\{ \int \min_{a \in \alpha} \|x - a\|^2 \, dP(x) : \alpha \subset \mathbb{R}^d,\ \operatorname{card}(\alpha) \le n \Big\}, \]

where the infimum is taken over all subsets $\alpha$ of $\mathbb{R}^d$ with $\operatorname{card}(\alpha) \le n$ for $n \ge 1$. Notice that if
$\int \|x\|^2 \, dP(x) < \infty$, then there is some set $\alpha$ for which the infimum is achieved (see [GL1]). This
set $\alpha$ can then be used to give a best approximation of $P$ by a discrete probability supported
on a set with no more than $n$ points. Such a set $\alpha$, for which the infimum occurs and which contains
no more than $n$ points, is called an optimal set of $n$-means, or optimal set of $n$-quantizers. It
is known that for a continuous probability measure $P$ an optimal set of $n$-means always has
exactly $n$ elements (see [GL1]). The probability measure $P$ considered in this paper is a uniform
distribution which is absolutely continuous with respect to the Lebesgue measure $\lambda$, and so there
exists a probability density function $f$, known as the Radon-Nikodym derivative of $P$ with respect
to $\lambda$, with $f \ge 0$ and $\int f \, d\lambda = 1$ such that for any Borel subset $B \subset \mathbb{R}^d$, we have

\[ P(B) = \int_B f \, d\lambda. \tag{1} \]

Given a finite subset $\alpha \subset \mathbb{R}^d$, the Voronoi region generated by $a \in \alpha$ is defined by

\[ M(a|\alpha) = \big\{ x \in \mathbb{R}^d : \|x - a\| = \min_{b \in \alpha} \|x - b\| \big\}, \]

i.e., the Voronoi region generated by $a \in \alpha$ is the set of all points in $\mathbb{R}^d$ which are closest to
$a \in \alpha$, and the set $\{M(a|\alpha) : a \in \alpha\}$ is called the Voronoi diagram or Voronoi tessellation of
$\mathbb{R}^d$ with respect to $\alpha$.
A Borel measurable partition $\{A_a : a \in \alpha\}$ of $\mathbb{R}^d$ is called a Voronoi partition of $\mathbb{R}^d$ with respect
to $\alpha$ (and $P$) if, $P$-almost surely, we have

\[ A_a \subset M(a|\alpha) \quad \text{for every } a \in \alpha. \]

Notice that if $\alpha = \{a_1, a_2, \dots, a_n\}$ is an optimal set of $n$-means for $P$ and $\{A_1, A_2, \dots, A_n\}$ is
a Voronoi partition with respect to $\alpha$, then

\[ V_n = \sum_{i=1}^{n} \int_{A_i} \|x - a_i\|^2 \, dP(x). \]

Let us now state the following proposition (see [GG, GL1]).


Proposition 1.1. Let $\alpha$ be an optimal set of $n$-means, $a \in \alpha$, and $M(a|\alpha)$ be the Voronoi
region generated by $a \in \alpha$, i.e.,

\[ M(a|\alpha) = \{ x \in \mathbb{R}^d : \|x - a\| = \min_{b \in \alpha} \|x - b\| \}. \]

Then, for every $a \in \alpha$,

(i) $P(M(a|\alpha)) > 0$, (ii) $P(\partial M(a|\alpha)) = 0$, (iii) $a = E(X : X \in M(a|\alpha))$, and (iv) $P$-almost
surely the set $\{M(a|\alpha) : a \in \alpha\}$ forms a Voronoi partition of $\mathbb{R}^d$.


Let $\alpha$ be an optimal set of $n$-means and $a \in \alpha$. Then, by Proposition 1.1, we have

\[ a = \frac{1}{P(M(a|\alpha))} \int_{M(a|\alpha)} x \, dP = \frac{\int_{M(a|\alpha)} x \, dP}{\int_{M(a|\alpha)} dP} = \frac{\int_{M(a|\alpha)} x f(x) \, d\lambda}{\int_{M(a|\alpha)} f(x) \, d\lambda}, \]

which implies that $a$ is the centroid of the Voronoi region $M(a|\alpha)$ associated with the probability
measure $P$ (see also [DFG]).

The classical Cantor set $C$ is generated by the two contractive similarity mappings $S_1(x) = \frac{1}{3}x$
and $S_2(x) = \frac{1}{3}x + \frac{2}{3}$ for all $x \in \mathbb{R}$. Then, there exists a unique Borel probability measure $P$
on $\mathbb{R}$ with support $C$ such that $P = \frac{1}{2} P \circ S_1^{-1} + \frac{1}{2} P \circ S_2^{-1}$, where $P \circ S_i^{-1}$ denotes the image
measure of $P$ with respect to $S_i$ for $i = 1, 2$ (see [H]). Such a probability measure is
singular with respect to the Lebesgue measure, and in [GL2], Graf-Luschgy investigated the
optimal quantization for this measure $P$.

In this paper, we consider a uniform distribution on an equilateral triangle, and
investigate the optimal sets of $n$-means and the $n$th quantization error for this distribution
for all $n \ge 1$. In Theorem 3.1, we show that the Voronoi regions generated
by the two points in an optimal set of two-means partition the equilateral triangle into an
isosceles trapezoid and an equilateral triangle whose areas are in the golden ratio. In subsequent
sections, we find the optimal sets of three- and four-means. In the last section, we give
numerical optimization results and conjectures about the optimal configurations for $n$ points,
a rigorous bound on the quantization error for $n \to \infty$ (Theorem 6.3 and its corollary), and a
final conjecture about uniform distributions in more general geometries.

Our approach illustrates methods for far more general geometries, including the use of symmetry
to find optimal sets for small $n$, numerical optimisation for intermediate $n$, and configurations
close to the triangular lattice for large $n$. Efficient quantization due to matching of the
boundaries to a triangular lattice is only possible in polygons with all angles a multiple of $\pi/3$.
The simplest and most natural example of this is the equilateral triangle.


2. Some basic results relating to quantization and uniform distributions 

In this section we give some basic results relating to optimal sets and the uniform probability
distributions defined on equilateral triangles. Let $X = (X_1, X_2)$ be a bivariate continuous
random variable with uniform distribution taking values on the triangle $\triangle$ with vertices
$(0,0)$, $(1,0)$, $(\frac{1}{2}, \frac{\sqrt{3}}{2})$. Then, the probability density function (pdf) $f(x_1, x_2)$ of the random
variable $X$ is given by

\[ f(x_1, x_2) = \begin{cases} \frac{4}{\sqrt{3}} & \text{for } 0 < x_1 < \frac{1}{2},\ 0 < x_2 < \sqrt{3}\,x_1, \\ \frac{4}{\sqrt{3}} & \text{for } \frac{1}{2} \le x_1 < 1,\ 0 < x_2 < -\sqrt{3}\,x_1 + \sqrt{3}, \\ 0 & \text{otherwise.} \end{cases} \]

Notice that the pdf satisfies the following two necessary conditions:

(i) $f(x_1, x_2) \ge 0$ for all $(x_1, x_2) \in \mathbb{R}^2$, and

(ii) $\iint_{\mathbb{R}^2} f(x_1, x_2)\,dx_1 dx_2 = \int_0^{1/2}\!\int_0^{\sqrt{3}x_1} f(x_1, x_2)\,dx_2\,dx_1 + \int_{1/2}^{1}\!\int_0^{-\sqrt{3}x_1 + \sqrt{3}} f(x_1, x_2)\,dx_2\,dx_1 = 1$.

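Moreover, one should notice that the pdf of the bivariate random variable $X$ can also be
written in the following form:

\[ f(x_1, x_2) = \begin{cases} \frac{4}{\sqrt{3}} & \text{for } 0 < x_2 < \frac{\sqrt{3}}{2},\ \frac{x_2}{\sqrt{3}} < x_1 < \frac{\sqrt{3} - x_2}{\sqrt{3}}, \\ 0 & \text{otherwise.} \end{cases} \]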


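Let $f_1(x_1)$ and $f_2(x_2)$ represent the marginal pdfs of the random variables $X_1$ and $X_2$ respectively.
Then, following the definitions in probability theory, we have

\[ f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2 \quad\text{and}\quad f_2(x_2) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1. \]

Since $\int_0^{\sqrt{3}x_1} f(x_1, x_2)\,dx_2 = 4x_1$ for $0 < x_1 < \frac{1}{2}$, and $\int_0^{-\sqrt{3}x_1+\sqrt{3}} f(x_1, x_2)\,dx_2 = 4(1 - x_1)$ for
$\frac{1}{2} \le x_1 < 1$, we have

\[ f_1(x_1) = \begin{cases} 4x_1 & \text{for } 0 < x_1 < \frac{1}{2}, \\ 4(1 - x_1) & \text{for } \frac{1}{2} \le x_1 < 1, \\ 0 & \text{otherwise.} \end{cases} \]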


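Similarly, we can write

\[ f_2(x_2) = \begin{cases} \frac{4}{\sqrt{3}}\Big(1 - \frac{2x_2}{\sqrt{3}}\Big) & \text{for } 0 < x_2 < \frac{\sqrt{3}}{2}, \\ 0 & \text{otherwise.} \end{cases} \]

Notice that both $f_1(x_1)$ and $f_2(x_2)$ satisfy the necessary conditions for pdfs: $f_1(x_1) \ge 0$, $f_2(x_2) \ge 0$
for all $x_1, x_2 \in \mathbb{R}$, and

\[ \int_{-\infty}^{\infty} f_1(x_1)\,dx_1 = 1 = \int_{-\infty}^{\infty} f_2(x_2)\,dx_2. \]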

For a random variable $Y$, let $E(Y)$ and $V(Y)$ represent the expected vector and the expected
squared distance of $Y$. Let $i$ and $j$ be the unit vectors in the positive directions of the $x_1$- and $x_2$-axes
respectively. By the position vector $\tilde a$ of a point $A$, it is meant that $\overrightarrow{OA} = \tilde a$. In the sequel, we
will identify the position vector of a point $(a_1, a_2)$ by $(a_1, a_2) := a_1 i + a_2 j$, and apologize for any
abuse in notation. For any two vectors $\vec u$ and $\vec v$, let $\vec u \cdot \vec v$ denote the dot product between the two
vectors $\vec u$ and $\vec v$. Then, for any vector $\vec v$, by $(\vec v)^2$, we mean $(\vec v)^2 := \vec v \cdot \vec v$. Thus, $|\vec v| := \sqrt{\vec v \cdot \vec v}$, which
is called the length of the vector $\vec v$. For any two position vectors $\tilde a := (a_1, a_2)$ and $\tilde b := (b_1, b_2)$,
we write $\rho(\tilde a, \tilde b) := ((a_1 - b_1, a_2 - b_2))^2 = (a_1 - b_1)^2 + (a_2 - b_2)^2$.

Let us now prove the following lemma. 

Lemma 2.1. Let $X = (X_1, X_2)$ be a bivariate continuous random variable with uniform distribution
taking values on the triangle $\triangle$. Then,

\[ E(X) = (E(X_1), E(X_2)) = \Big(\frac{1}{2}, \frac{\sqrt{3}}{6}\Big) \quad\text{and}\quad V(X) = V(X_1) + V(X_2) = \frac{1}{12}. \]
Proof. We have

\[ E(X_1) = \int_{-\infty}^{\infty} x_1 f_1(x_1)\,dx_1 = \int_0^{1/2} 4x_1^2\,dx_1 + \int_{1/2}^{1} 4(1 - x_1)x_1\,dx_1 = \frac{1}{2}, \]

\[ E(X_2) = \int_{-\infty}^{\infty} x_2 f_2(x_2)\,dx_2 = \int_0^{\sqrt{3}/2} \frac{4}{\sqrt{3}}\Big(1 - \frac{2x_2}{\sqrt{3}}\Big)x_2\,dx_2 = \frac{\sqrt{3}}{6}, \]

\[ E(X_1^2) = \int_{-\infty}^{\infty} x_1^2 f_1(x_1)\,dx_1 = \int_0^{1/2} 4x_1^3\,dx_1 + \int_{1/2}^{1} 4(1 - x_1)x_1^2\,dx_1 = \frac{7}{24}, \]

\[ E(X_2^2) = \int_{-\infty}^{\infty} x_2^2 f_2(x_2)\,dx_2 = \int_0^{\sqrt{3}/2} \frac{4}{\sqrt{3}}\Big(1 - \frac{2x_2}{\sqrt{3}}\Big)x_2^2\,dx_2 = \frac{1}{8}, \]

and so,

\[ E(X) = \iint (x_1 i + x_2 j) f(x_1, x_2)\,dx_1 dx_2 = i\int x_1 f_1(x_1)\,dx_1 + j\int x_2 f_2(x_2)\,dx_2 = (E(X_1), E(X_2)) = \Big(\frac{1}{2}, \frac{\sqrt{3}}{6}\Big), \]

\[ V(X_1) = E(X_1^2) - [E(X_1)]^2 = \frac{1}{24} \quad\text{and}\quad V(X_2) = E(X_2^2) - [E(X_2)]^2 = \frac{1}{24}. \]

Thus, we have

\[ V(X) = E\|X - E(X)\|^2 = \iint \big((x_1 - E(X_1))^2 + (x_2 - E(X_2))^2\big) f(x_1, x_2)\,dx_1 dx_2, \]

which yields

\[ V(X) = \int (x_1 - E(X_1))^2 f_1(x_1)\,dx_1 + \int (x_2 - E(X_2))^2 f_2(x_2)\,dx_2 = V(X_1) + V(X_2) = \frac{1}{12}. \]

Hence the lemma. □

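Note 2.2. We have $E(X_1) = \frac{1}{2}$ and $E(X_2) = \frac{\sqrt{3}}{6}$, and so by the standard rule of probability
theory, for any two real numbers $a$ and $b$, we deduce $E(X_1 - a)^2 = E(X_1 - \frac{1}{2})^2 + (a - \frac{1}{2})^2 =
V(X_1) + (a - \frac{1}{2})^2$, and similarly $E(X_2 - b)^2 = V(X_2) + (b - \frac{\sqrt{3}}{6})^2$. Thus, for any $(a, b) \in \mathbb{R}^2$,
we have

\[ \begin{aligned} E\|X - (a, b)\|^2 &= \iint_{\mathbb{R}^2} [(x_1 - a)^2 + (x_2 - b)^2] f(x_1, x_2)\,dx_1 dx_2 = \int_{\mathbb{R}} (x_1 - a)^2 f_1(x_1)\,dx_1 + \int_{\mathbb{R}} (x_2 - b)^2 f_2(x_2)\,dx_2 \\ &= E(X_1 - a)^2 + E(X_2 - b)^2 = V(X_1) + V(X_2) + (a - \tfrac{1}{2})^2 + (b - \tfrac{\sqrt{3}}{6})^2 \\ &= \frac{1}{12} + \Big\|(a, b) - \Big(\frac{1}{2}, \frac{\sqrt{3}}{6}\Big)\Big\|^2. \end{aligned} \]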


Note 2.3. From Note 2.2 it is clear that the optimal set of one-mean consists of the expected
vector $(\frac{1}{2}, \frac{\sqrt{3}}{6})$ of the random variable $X$, which is the centroid of the triangle $\triangle$, and the
corresponding quantization error is $\frac{1}{12}$, which is the expected squared distance of the random variable
$X$.
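As a quick numerical cross-check of Lemma 2.1 and Note 2.2, the following Python sketch uses a
standard identity for the uniform distribution on a triangle: the expected squared distance to
the centroid equals the sum of the squared side lengths divided by 36. The function names are
illustrative only and are not taken from the paper.

```python
import math

def triangle_moments(v1, v2, v3):
    """Return (centroid, expected squared distance to the centroid) for the
    uniform distribution on the triangle with vertices v1, v2, v3."""
    gx = (v1[0] + v2[0] + v3[0]) / 3.0
    gy = (v1[1] + v2[1] + v3[1]) / 3.0
    # Identity: E||X - g||^2 = (sum of squared side lengths) / 36.
    sides = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                for p, q in ((v1, v2), (v2, v3), (v3, v1)))
    return (gx, gy), sides / 36.0

def expected_sq_dist(v1, v2, v3, point):
    """E||X - point||^2 via the decomposition of Note 2.2:
    variance plus squared distance from the point to the centroid."""
    (gx, gy), var = triangle_moments(v1, v2, v3)
    return var + (point[0] - gx) ** 2 + (point[1] - gy) ** 2

O, A, B = (0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)
centroid, variance = triangle_moments(O, A, B)
print("E(X) =", centroid)          # expected to be close to (1/2, sqrt(3)/6)
print("V(X) =", variance)          # expected to be close to 1/12
print("E||X - O||^2 =", expected_sq_dist(O, A, B, O))
```

Any candidate one-mean other than the centroid gives a strictly larger value in
`expected_sq_dist`, which is the content of Note 2.3.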


3. Optimal sets of 2-means 

In this section we obtain all the optimal sets of two-means and the corresponding quantization
error. Let $\triangle$ be the equilateral triangle with vertices $O(0, 0)$, $A(1, 0)$, and $B(\frac{1}{2}, \frac{\sqrt{3}}{2})$. Let us divide
the triangle $\triangle$ by a straight line $\ell$ into two regions. Let us first assume that the vertex $O$ is on
one side of $\ell$ and the vertices $A$ and $B$ are on the other side of $\ell$. It might be that one of $A$ and
$B$ lies on the line $\ell$. Thus, the triangle $\triangle$ is divided into two regions: the triangle $OCD$ and the
quadrilateral $CABD$, where $C$ and $D$ are the points of intersection of the line $\ell$ with the sides
$OA$ and $OB$ respectively. If either $A$ or $B$ is on the line $\ell$, then $CABD$ will also be a triangle.
Let $P$ and $Q$ be the centroids of the regions $OCD$ and $CABD$ respectively. Let the position
vectors of $A, B, P, Q, C, D$ be denoted respectively by $\tilde a, \tilde b, \tilde p, \tilde q, \tilde c, \tilde d$. Then, there exist scalars
$\alpha$ and $\beta$ such that $\tilde c = \alpha\tilde a$, $\tilde d = \beta\tilde b$, $\tilde p = \frac{1}{3}(\tilde c + \tilde d) = \frac{1}{3}(\alpha\tilde a + \beta\tilde b)$, and the area of the triangle
$OCD$ is $\frac{\sqrt{3}}{4}\alpha\beta$. Since the probability measure is uniformly distributed over $\triangle$, taking moments
about the origin, we have

\[ \tilde q = \frac{\frac{1}{3}(\tilde a + \tilde b)\frac{\sqrt{3}}{4} - \frac{1}{3}(\alpha\tilde a + \beta\tilde b)\frac{\sqrt{3}}{4}\alpha\beta}{\frac{\sqrt{3}}{4} - \frac{\sqrt{3}}{4}\alpha\beta} = \frac{\tilde a + \tilde b - \alpha\beta(\alpha\tilde a + \beta\tilde b)}{3(1 - \alpha\beta)}. \]

Figure 1. Optimal configuration of two points P and Q.

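If $P$ and $Q$ form an optimal set of two-means, then $CD$ will be the boundary of their
corresponding Voronoi regions, and so we have $|\overrightarrow{CP}| = |\overrightarrow{CQ}|$ and $|\overrightarrow{DP}| = |\overrightarrow{DQ}|$, i.e., $(\overrightarrow{CP})^2 =
(\overrightarrow{CQ})^2$ and $(\overrightarrow{DP})^2 = (\overrightarrow{DQ})^2$. Using the dot product of vectors, we have $\tilde a^2 = \tilde b^2 = 1$ and
$\tilde a \cdot \tilde b = 1 \cdot 1 \cdot \cos\frac{\pi}{3} = \frac{1}{2}$. Then, $(\overrightarrow{CP})^2 = (\overrightarrow{CQ})^2$ implies

\[ \Big(\frac{1}{3}(\alpha\tilde a + \beta\tilde b) - \alpha\tilde a\Big)^2 = \Big(\frac{\tilde a + \tilde b - \alpha\beta(\alpha\tilde a + \beta\tilde b)}{3(1 - \alpha\beta)} - \alpha\tilde a\Big)^2, \]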

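which after simplification yields

\[ 4\alpha^3\beta + \alpha^2\beta^2 - 6\alpha^2\beta - 5\alpha^2 - 2\alpha\beta^3 + 3\alpha\beta^2 - 2\alpha\beta + 9\alpha + \beta^2 - 3 = 0. \tag{2} \]

Due to symmetry, $(\overrightarrow{DP})^2 = (\overrightarrow{DQ})^2$ yields

\[ 4\alpha\beta^3 + \alpha^2\beta^2 - 6\alpha\beta^2 - 5\beta^2 - 2\alpha^3\beta + 3\alpha^2\beta - 2\alpha\beta + 9\beta + \alpha^2 - 3 = 0. \tag{3} \]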

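Solving (2) and (3), we get five sets of solutions for $\alpha$ and $\beta$: $\{\alpha = \frac{1}{2}, \beta = 1\}$, $\{\alpha = 1, \beta = \frac{1}{2}\}$,
$\{\alpha = 1, \beta = 1\}$, $\{\alpha = \frac{1}{2}(-1 - \sqrt{5}), \beta = \frac{1}{2}(-1 - \sqrt{5})\}$, and $\{\alpha = \frac{1}{2}(\sqrt{5} - 1), \beta = \frac{1}{2}(\sqrt{5} - 1)\}$,
among which the admissible solutions are $\{\alpha = \frac{1}{2}, \beta = 1\}$, $\{\alpha = 1, \beta = \frac{1}{2}\}$, and $\{\alpha = \frac{1}{2}(\sqrt{5} - 1), \beta =
\frac{1}{2}(\sqrt{5} - 1)\}$. If $\{\alpha = \frac{1}{2}, \beta = 1\}$, then the line $\ell$ passes through the vertex $B$, and if $\{\alpha = 1, \beta = \frac{1}{2}\}$,
then the line $\ell$ passes through the vertex $A$. Let us first take $\{\alpha = \frac{1}{2}, \beta = 1\}$. Then, $\tilde p = (\frac{1}{3}, \frac{\sqrt{3}}{6})$
and $\tilde q = (\frac{2}{3}, \frac{\sqrt{3}}{6})$, and the corresponding quantization error is

\[ \int_0^{1/2}\!\int_0^{\sqrt{3}x_1} \frac{4}{\sqrt{3}}\Big((x_1 - \tfrac{1}{3})^2 + (x_2 - \tfrac{\sqrt{3}}{6})^2\Big)\,dx_2\,dx_1 + \int_{1/2}^{1}\!\int_0^{-\sqrt{3}(x_1 - 1)} \frac{4}{\sqrt{3}}\Big((x_1 - \tfrac{2}{3})^2 + (x_2 - \tfrac{\sqrt{3}}{6})^2\Big)\,dx_2\,dx_1 = \frac{1}{18} = 0.0555556. \]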

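Similarly, it can be shown that if $\{\alpha = 1, \beta = \frac{1}{2}\}$, then the quantization error is $0.0555556$. Now
take $\alpha = \beta = \frac{1}{2}(\sqrt{5} - 1)$. Then, $\tilde p = (0.309017, 0.178411)$ and $\tilde q = (0.618034, 0.356822)$, and the
corresponding quantization error is

\[ \begin{aligned}
& \int_0^{\frac{1}{4}(\sqrt{5}-1)}\!\int_0^{\sqrt{3}x_1} \frac{4}{\sqrt{3}}\big((x_1 - 0.309017)^2 + (x_2 - 0.178411)^2\big)\,dx_2\,dx_1 \\
&+ \int_{\frac{1}{4}(\sqrt{5}-1)}^{\frac{1}{2}(\sqrt{5}-1)}\!\int_0^{\frac{1}{2}(\sqrt{15}-\sqrt{3})-\sqrt{3}x_1} \frac{4}{\sqrt{3}}\big((x_1 - 0.309017)^2 + (x_2 - 0.178411)^2\big)\,dx_2\,dx_1 \\
&+ \int_{\frac{1}{4}(\sqrt{5}-1)}^{\frac{1}{2}}\!\int_{\frac{1}{2}(\sqrt{15}-\sqrt{3})-\sqrt{3}x_1}^{\sqrt{3}x_1} \frac{4}{\sqrt{3}}\big((x_1 - 0.618034)^2 + (x_2 - 0.356822)^2\big)\,dx_2\,dx_1 \\
&+ \int_{\frac{1}{2}}^{\frac{1}{2}(\sqrt{5}-1)}\!\int_{\frac{1}{2}(\sqrt{15}-\sqrt{3})-\sqrt{3}x_1}^{-\sqrt{3}(x_1-1)} \frac{4}{\sqrt{3}}\big((x_1 - 0.618034)^2 + (x_2 - 0.356822)^2\big)\,dx_2\,dx_1 \\
&+ \int_{\frac{1}{2}(\sqrt{5}-1)}^{1}\!\int_0^{-\sqrt{3}(x_1-1)} \frac{4}{\sqrt{3}}\big((x_1 - 0.618034)^2 + (x_2 - 0.356822)^2\big)\,dx_2\,dx_1 \\
&= 0.0532767.
\end{aligned} \]

Figure 2. Optimal configuration of three points P, Q and R.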

Since $0.0532767 < 0.0555556$, an optimal set of two-means is obtained for $\alpha = \beta = \frac{1}{2}(\sqrt{5} - 1)$,
i.e., the set $\{(0.309017, 0.178411), (0.618034, 0.356822)\}$ forms an optimal set of two-means,
and the two means lie on the median passing through the vertex $O$ (see Figure 1). Notice that
$g^{-1} = \frac{1}{2}(\sqrt{5} - 1)$, where $g := \frac{\sqrt{5}+1}{2}$ is the golden ratio. Since $\alpha = \beta = g^{-1}$, we can say that
the line $\ell$ is parallel to the side $AB$, and cuts the triangle $\triangle$ into an equilateral triangle and an
isosceles trapezoid. Due to symmetry, the line $\ell$ can also be parallel to either $OA$ or $OB$, i.e.,
the two means can also lie either on the median passing through the vertex $B$, or on the median
passing through the vertex $A$. Moreover, it can be seen that

\[ \frac{\text{Area of the isosceles trapezoid } CABD}{\text{Area of the equilateral triangle } OCD} = \frac{\frac{1}{8}\sqrt{3}(\sqrt{5} - 1)}{\frac{1}{8}\sqrt{3}(3 - \sqrt{5})} = \frac{\sqrt{5} - 1}{3 - \sqrt{5}} = g. \]

Therefore, we can deduce the following theorem. 

Theorem 3.1. Let $X$ be a random variable with uniform distribution on the equilateral triangle
$\triangle$ with vertices $(0,0)$, $(1,0)$, and $(\frac{1}{2}, \frac{\sqrt{3}}{2})$. Then, there are three optimal sets of two-means with
quantization error $0.0532767$. If the triangle $\triangle$ is partitioned into an isosceles trapezoid and
an equilateral triangle in the golden ratio, then the centroids of the isosceles trapezoid and the
equilateral triangle form an optimal set of two-means.
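The two error values compared in this section can be reproduced with a short Python sketch.
It is only a numerical check under stated assumptions, not the method of the paper: each region
is split into triangles, and the integral of the squared distance over a triangle is obtained from
the identity "area times (sum of squared sides / 36 + squared distance from the triangle's centroid
to the point)". All helper names (`fan`, `distortion`, `parallel_cut`) are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
O, A, B = (0.0, 0.0), (1.0, 0.0), (0.5, SQRT3 / 2)
AREA = SQRT3 / 4  # area of the unit equilateral triangle

def fan(poly):
    """Split a convex polygon into triangles sharing its first vertex."""
    return [(poly[0], poly[i], poly[i + 1]) for i in range(1, len(poly) - 1)]

def tri_area(u, v, w):
    return abs((v[0] - u[0]) * (w[1] - u[1]) - (w[0] - u[0]) * (v[1] - u[1])) / 2

def tri_centroid(u, v, w):
    return ((u[0] + v[0] + w[0]) / 3, (u[1] + v[1] + w[1]) / 3)

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def centroid(poly):
    """Area-weighted centroid of a convex polygon."""
    parts = [(tri_area(*t), tri_centroid(*t)) for t in fan(poly)]
    total = sum(a for a, _ in parts)
    return (sum(a * g[0] for a, g in parts) / total,
            sum(a * g[1] for a, g in parts) / total)

def sq_dist_integral(poly, c):
    """Integral of ||x - c||^2 over a convex polygon (Lebesgue measure)."""
    total = 0.0
    for u, v, w in fan(poly):
        sides = dist2(u, v) + dist2(v, w) + dist2(w, u)
        total += tri_area(u, v, w) * (sides / 36 + dist2(tri_centroid(u, v, w), c))
    return total

def distortion(regions):
    """Quantization error when each region is represented by its own centroid."""
    return sum(sq_dist_integral(r, centroid(r)) for r in regions) / AREA

def parallel_cut(t):
    """Regions produced by the line through C = t*A and D = t*B, 0 < t < 1."""
    C, D = (t, 0.0), (t * B[0], t * B[1])
    return [[O, C, D], [C, A, B, D]]

golden_inverse = (math.sqrt(5) - 1) / 2
N = (0.5, 0.0)  # midpoint of OA: the cut alpha = 1/2, beta = 1 is the median BN
print("cut parallel to AB:", distortion(parallel_cut(golden_inverse)))
print("cut along median  :", distortion([[O, N, B], [N, A, B]]))
```

Evaluating `distortion(parallel_cut(t))` for nearby values of `t` gives larger errors than at
`golden_inverse`, consistent with the choice $\alpha = \beta = g^{-1}$ above.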


4. Optimal set of 3-means 

Theorem 4.1. For the uniform distribution on the equilateral triangle with vertices $(0,0)$, $(1,0)$
and $(\frac{1}{2}, \frac{\sqrt{3}}{2})$, the set $\{(\frac{7}{24}, \frac{7}{24\sqrt{3}}), (\frac{17}{24}, \frac{7}{24\sqrt{3}}), (\frac{1}{2}, \frac{11}{12\sqrt{3}})\}$ is an optimal set of three-means. The
three means in this case form an equilateral triangle having the sides parallel to the sides of the
original triangle.
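Proof. Due to the symmetry of the triangle with the uniform distribution, we can assume that one
element in the optimal set of three-means lies on a median of the triangle, and the other two are
equidistant from the median. As shown in Figure 2, let the median passing through the vertex
$B$ cut the side $OA$ at the point $N$, and let one element in the optimal set of three-means lie
on this median. Let the boundaries of the Voronoi regions cut the sides $OB$ and $AB$ at the
points $C$ and $D$ respectively. Let the three boundaries of the Voronoi regions meet at the point
$M$ which lies on the median $BN$. Let the position vectors of the points $A, B, C, D, M, N$ be
respectively $\tilde a, \tilde b, \tilde c, \tilde d, \tilde m, \tilde n$. Let $\alpha$ and $\beta$ be two scalars such that the length of $BC$ equals
$\alpha$ and the length of $BM$ equals $\frac{\sqrt{3}}{2}\beta$. Due to symmetry, the length of $BD$ is also $\alpha$. Then,
$\tilde c = (1 - \alpha)\tilde b$, $\tilde d = \alpha\tilde a + (1 - \alpha)\tilde b$, and $\tilde m = \beta\tilde n + (1 - \beta)\tilde b$. The area of the triangle $BCM$ equals
the area of the triangle $BDM$, namely $\frac{\sqrt{3}}{8}\alpha\beta$. Let the centroids of the quadrilaterals $ONMC$, $NADM$,
and $BCMD$ be $P$, $Q$, and $R$ with position vectors $\tilde p$, $\tilde q$, and $\tilde r$ respectively. Since the probability
measure is uniformly distributed over $\triangle$, taking moments about the origin, we have

\[ \tilde p = \frac{\frac{1}{3}(\tilde b + \tilde n)\frac{\sqrt{3}}{8} - \frac{1}{3}(\tilde b + \tilde c + \tilde m)\frac{\sqrt{3}}{8}\alpha\beta}{\frac{\sqrt{3}}{8} - \frac{\sqrt{3}}{8}\alpha\beta} = \frac{\tilde b + \tilde n - (\tilde b + \tilde c + \tilde m)\alpha\beta}{3(1 - \alpha\beta)}, \]

\[ \tilde q = \frac{\frac{1}{3}(\tilde a + \tilde b + \tilde n)\frac{\sqrt{3}}{8} - \frac{1}{3}(\tilde b + \tilde d + \tilde m)\frac{\sqrt{3}}{8}\alpha\beta}{\frac{\sqrt{3}}{8} - \frac{\sqrt{3}}{8}\alpha\beta} = \frac{\tilde a + \tilde b + \tilde n - (\tilde b + \tilde d + \tilde m)\alpha\beta}{3(1 - \alpha\beta)}, \]

\[ \tilde r = \frac{\frac{1}{3}(\tilde b + \tilde c + \tilde m)\frac{\sqrt{3}}{8}\alpha\beta + \frac{1}{3}(\tilde b + \tilde d + \tilde m)\frac{\sqrt{3}}{8}\alpha\beta}{\frac{\sqrt{3}}{4}\alpha\beta} = \frac{\tilde c + \tilde d + 2(\tilde b + \tilde m)}{6}. \]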

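If $P$, $Q$ and $R$ are the optimal points, we must have $(\overrightarrow{CR})^2 = (\overrightarrow{CP})^2$, $(\overrightarrow{MR})^2 = (\overrightarrow{MP})^2$, $(\overrightarrow{MR})^2 =
(\overrightarrow{MQ})^2$ and $(\overrightarrow{DR})^2 = (\overrightarrow{DQ})^2$. Using the dot product of vectors, we have $\tilde a^2 = 1$, $\tilde b^2 = 1$, $\tilde n^2 =
\frac{1}{4}$, $\tilde a \cdot \tilde n = \tilde a \cdot \tilde b = \frac{1}{2}$, and $\tilde b \cdot \tilde n = \frac{1}{4}$. Then, $(\overrightarrow{CR})^2 = (\overrightarrow{CP})^2$ implies

\[ \Big((1 - \alpha)\tilde b - \frac{\tilde c + \tilde d + 2(\tilde b + \tilde m)}{6}\Big)^2 = \Big((1 - \alpha)\tilde b - \frac{\tilde b + \tilde n - (\tilde b + \tilde c + \tilde m)\alpha\beta}{3(1 - \alpha\beta)}\Big)^2, \]

which after simplification yields

\[ 5\alpha^4\beta^2 + 6\alpha^3\beta + \alpha^2(6\beta^2 - 28\beta - 15) - 6\alpha(\beta^3 - 2\beta^2 + 2\beta - 7) + 3\beta^2 - 13 = 0. \tag{4} \]

$(\overrightarrow{MR})^2 = (\overrightarrow{MP})^2$ implies

\[ \Big(\beta\tilde n + (1 - \beta)\tilde b - \frac{\tilde c + \tilde d + 2(\tilde b + \tilde m)}{6}\Big)^2 = \Big(\beta\tilde n + (1 - \beta)\tilde b - \frac{\tilde b + \tilde n - (\tilde b + \tilde c + \tilde m)\alpha\beta}{3(1 - \alpha\beta)}\Big)^2, \]

which after simplification yields

\[ -\alpha^4\beta^2 - 6\alpha^3\beta + \alpha^2(6\beta^2 + 14\beta + 3) + 12\alpha\beta(\beta^2 - 2\beta - 1) - 15\beta^2 + 36\beta - 13 = 0. \tag{5} \]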
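Solving the equations (4) and (5), we have $\alpha = \frac{1}{2}$ and $\beta = \frac{2}{3}$. Then, we have $\tilde p = (\frac{7}{24}, \frac{7}{24\sqrt{3}})$, $\tilde q =
(\frac{17}{24}, \frac{7}{24\sqrt{3}})$, and $\tilde r = (\frac{1}{2}, \frac{11}{12\sqrt{3}})$. Moreover, $\tilde c = (\frac{1}{4}, \frac{\sqrt{3}}{4})$ and $\tilde d = (\frac{3}{4}, \frac{\sqrt{3}}{4})$. Here the equation of the
line $OB$ is $x_2 = \sqrt{3}x_1$, and the equation of the line $CM$ is $x_2 = -\frac{x_1 - 1}{\sqrt{3}}$. Thus, if $V_3(P)$ is the
quantization error due to the point $P$ in its Voronoi region, then we have

\[ V_3(P) = \int_0^{1/4}\!\int_0^{\sqrt{3}x_1} \frac{4}{\sqrt{3}}\Big((x_1 - \tfrac{7}{24})^2 + (x_2 - \tfrac{7}{24\sqrt{3}})^2\Big)\,dx_2\,dx_1 + \int_{1/4}^{1/2}\!\int_0^{-\frac{x_1 - 1}{\sqrt{3}}} \frac{4}{\sqrt{3}}\Big((x_1 - \tfrac{7}{24})^2 + (x_2 - \tfrac{7}{24\sqrt{3}})^2\Big)\,dx_2\,dx_1 = \frac{11}{1296}. \]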

Figure 3. Optimal configuration of four points P, Q, R and S.

Due to the uniform distribution and the symmetry of the points, we have $V_3(P) = V_3(Q) =
V_3(R) = \frac{11}{1296}$. Thus, the set $\{(\frac{7}{24}, \frac{7}{24\sqrt{3}}), (\frac{17}{24}, \frac{7}{24\sqrt{3}}), (\frac{1}{2}, \frac{11}{12\sqrt{3}})\}$ forms an optimal set of three-means
with quantization error $V_3 = 3 \times \frac{11}{1296} = \frac{11}{432}$. Notice that the points $(\frac{7}{24}, \frac{7}{24\sqrt{3}})$ and $(\frac{17}{24}, \frac{7}{24\sqrt{3}})$ lie
on the medians passing through the vertices $O$ and $A$ respectively, and the three points in this
case form an equilateral triangle having the sides parallel to the sides of the original triangle.
Thus, due to symmetry, we can say that the set $\{(\frac{7}{24}, \frac{7}{24\sqrt{3}}), (\frac{17}{24}, \frac{7}{24\sqrt{3}}), (\frac{1}{2}, \frac{11}{12\sqrt{3}})\}$ is the only
optimal set of three-means. Hence, the proof of the theorem is complete. □
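The values $\alpha = \frac{1}{2}$, $\beta = \frac{2}{3}$ in the proof of Theorem 4.1 can be checked numerically. The
Python sketch below builds the three quadrilaterals $ONMC$, $NADM$ and $BCMD$ from these
parameters, computes their centroids, and sums the squared-distance integrals. The helper
functions are illustrative assumptions (a fan triangulation of convex polygons), not code from the
paper.

```python
import math

SQRT3 = math.sqrt(3.0)
AREA = SQRT3 / 4  # area of the unit equilateral triangle

def fan(poly):
    """Split a convex polygon into triangles sharing its first vertex."""
    return [(poly[0], poly[i], poly[i + 1]) for i in range(1, len(poly) - 1)]

def tri_area(u, v, w):
    return abs((v[0] - u[0]) * (w[1] - u[1]) - (w[0] - u[0]) * (v[1] - u[1])) / 2

def tri_centroid(u, v, w):
    return ((u[0] + v[0] + w[0]) / 3, (u[1] + v[1] + w[1]) / 3)

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def centroid(poly):
    """Area-weighted centroid of a convex polygon."""
    parts = [(tri_area(*t), tri_centroid(*t)) for t in fan(poly)]
    total = sum(a for a, _ in parts)
    return (sum(a * g[0] for a, g in parts) / total,
            sum(a * g[1] for a, g in parts) / total)

def sq_dist_integral(poly, c):
    """Integral of ||x - c||^2 over a convex polygon, triangle by triangle."""
    total = 0.0
    for u, v, w in fan(poly):
        sides = dist2(u, v) + dist2(v, w) + dist2(w, u)
        total += tri_area(u, v, w) * (sides / 36 + dist2(tri_centroid(u, v, w), c))
    return total

alpha, beta = 0.5, 2.0 / 3.0
O, A, B, N = (0.0, 0.0), (1.0, 0.0), (0.5, SQRT3 / 2), (0.5, 0.0)
C = ((1 - alpha) * B[0], (1 - alpha) * B[1])          # c = (1 - alpha) b
D = (alpha + (1 - alpha) * B[0], (1 - alpha) * B[1])  # d = alpha a + (1 - alpha) b
M = (0.5, (1 - beta) * B[1])                          # m = beta n + (1 - beta) b

regions = [[O, N, M, C], [N, A, D, M], [B, C, M, D]]
means = [centroid(r) for r in regions]
error = sum(sq_dist_integral(r, g) for r, g in zip(regions, means)) / AREA
print("means:", means)
print("quantization error:", error, "vs 11/432 =", 11 / 432)
```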

5. Optimal sets of 4-means 

In this section we calculate the optimal sets of four-means. Let $OAB$ be the equilateral
triangle with vertices $(0,0)$, $(1,0)$ and $(\frac{1}{2}, \frac{\sqrt{3}}{2})$. As shown in Figure 3, let $BN$ be the median of
the triangle passing through the vertex $B$ which cuts $OA$ at the point $N$. Let $\{P, Q, R, S\}$ be
an optimal set of four-means, where $P$, $Q$ are on the median $BN$, and $R$, $S$ are on opposite
sides of the median. Notice that our assumption is also verified by a numerical search algorithm
as mentioned in the next section. Let $CD$ be the boundary of the Voronoi regions of the points
$P$ and $R$, $DF$ be the boundary of the Voronoi regions of the points $P$ and $Q$ which cuts the
median $BN$ at the point $M$, and $FG$ be the boundary of the Voronoi regions of the points $P$ and
$S$. Let $DN_1$ and $FN_2$ be the boundaries of the Voronoi regions of the points $R$, $Q$ and $Q$, $S$
respectively. Let $\alpha, \beta, \gamma, \delta$ be four constants such that $BC = BG = \alpha$, $ON_1 = AN_2 = \delta$, and
$BM = \frac{\sqrt{3}}{2}\beta$; let the $x_1$-coordinate of $D$ be $\gamma$, so that due to symmetry the $x_1$-coordinate of $F$ is $1 - \gamma$. Then
we have

\[ \begin{aligned}
\tilde a &= (1, 0), & \tilde b &= (\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}), & \tilde n &= (\tfrac{1}{2}, 0), \\
\tilde c &= (1 - \alpha)\tilde b, & \tilde d &= (\gamma, \tfrac{1}{2}\sqrt{3}(1 - \beta)), & \tilde g &= \alpha\tilde a + (1 - \alpha)\tilde b, \\
\tilde m &= \tilde b(1 - \beta) + \beta\tilde n, & \tilde n_1 &= (\delta, 0), & \tilde n_2 &= (1 - \delta, 0), \\
\tilde f &= (1 - \gamma, \tfrac{1}{2}\sqrt{3}(1 - \beta)).
\end{aligned} \]


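The equation of the line $CD$ is $x_2 = \frac{1}{2}\sqrt{3}(1 - \beta) + \frac{\sqrt{3}(\alpha - \beta)(x_1 - \gamma)}{\alpha + 2\gamma - 1}$. The equation of the line $BD$
is $x_2 = \frac{\sqrt{3}\beta(2x_1 - 1)}{2(1 - 2\gamma)} + \frac{\sqrt{3}}{2}$. If $Ar_1$ is the area of the triangle $BCD$, then

\[ \begin{aligned}
Ar_1 &= \int_{\frac{1-\alpha}{2}}^{\gamma}\!\int_{\frac{1}{2}\sqrt{3}(1-\beta) + \frac{\sqrt{3}(\alpha-\beta)(x_1-\gamma)}{\alpha+2\gamma-1}}^{\sqrt{3}x_1} 1\,dx_2\,dx_1 + \int_{\gamma}^{\frac{1}{2}}\!\int_{\frac{\sqrt{3}\beta(2x_1-1)}{2(1-2\gamma)} + \frac{\sqrt{3}}{2}}^{\sqrt{3}x_1} 1\,dx_2\,dx_1 \\
&= \frac{1}{8}\sqrt{3}(\alpha + 2\gamma - 1)(\beta + 2\gamma - 1) + \frac{1}{8}\sqrt{3}(1 - 2\gamma)(\beta + 2\gamma - 1) = \frac{1}{8}\sqrt{3}\,\alpha(\beta + 2\gamma - 1).
\end{aligned} \]

If $Ar_2$ is the area of the triangle $BDF$, then $Ar_2 = \frac{1}{4}\sqrt{3}\,\beta(1 - 2\gamma)$. If $Ar_3$ is the area
of the triangle $BFG$, then $Ar_3 = Ar_1$. If $Ar_4$ is the area of the triangle $OCD$, then

\[ \begin{aligned}
Ar_4 &= \int_0^{\frac{1-\alpha}{2}}\!\int_{\frac{\sqrt{3}(1-\beta)}{2\gamma}x_1}^{\sqrt{3}x_1} 1\,dx_2\,dx_1 + \int_{\frac{1-\alpha}{2}}^{\gamma}\!\int_{\frac{\sqrt{3}(1-\beta)}{2\gamma}x_1}^{\frac{1}{2}\sqrt{3}(1-\beta) + \frac{\sqrt{3}(\alpha-\beta)(x_1-\gamma)}{\alpha+2\gamma-1}} 1\,dx_2\,dx_1 \\
&= \frac{\sqrt{3}(\alpha - 1)^2(\beta + 2\gamma - 1)}{16\gamma} - \frac{\sqrt{3}(\alpha - 1)(\alpha + 2\gamma - 1)(\beta + 2\gamma - 1)}{16\gamma}.
\end{aligned} \]

If $Ar_5$ is the area of the triangle $ODN_1$, then $Ar_5 = \frac{1}{4}\sqrt{3}\,\delta(1 - \beta)$. If $Ar_6$ is the area of the triangle
$DN_1N_2$, then $Ar_6 = \frac{1}{4}\sqrt{3}(1 - \beta)(1 - 2\delta)$. If $Ar_7$ is the area of the triangle $DFN_2$,
then $Ar_7 = \frac{1}{4}\sqrt{3}(1 - \beta)(1 - 2\gamma)$. Notice that due to symmetry, if $Ar_8$ is the area
of the triangle $FN_2A$ and $Ar_9$ the area of the triangle $FAG$, then $Ar_8 = Ar_5$ and $Ar_9 = Ar_4$.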
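As $P, Q, R, S$ are assumed to form an optimal set of four-means, they are also the centroids
of their corresponding Voronoi regions associated with the density function $f(x_1, x_2)$ which is
constant due to the uniform distribution over the triangle. Thus, $P, Q, R, S$ are respectively
the centroids of the pentagon $BCDFG$ and the quadrilaterals $DN_1N_2F$, $OCDN_1$, and $AN_2FG$. Hence,
we have

\[ \tilde p = \frac{\frac{1}{3}Ar_1(\tilde b + \tilde c + \tilde d) + \frac{1}{3}Ar_2(\tilde b + \tilde d + \tilde f) + \frac{1}{3}Ar_3(\tilde b + \tilde f + \tilde g)}{Ar_1 + Ar_2 + Ar_3}, \]

\[ \tilde q = \frac{\frac{1}{3}Ar_7(\tilde d + \tilde f + \tilde n_2) + \frac{1}{3}Ar_6(\tilde d + \tilde n_1 + \tilde n_2)}{Ar_6 + Ar_7}, \]

\[ \tilde r = \frac{\frac{1}{3}Ar_4(\tilde c + \tilde d) + \frac{1}{3}Ar_5(\tilde d + \tilde n_1)}{Ar_4 + Ar_5}, \]

\[ \tilde s = \frac{\frac{1}{3}Ar_9(\tilde a + \tilde f + \tilde g) + \frac{1}{3}Ar_8(\tilde a + \tilde f + \tilde n_2)}{Ar_8 + Ar_9}. \]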

Write $Q_1 := \rho(\tilde p, \tilde c) - \rho(\tilde c, \tilde r)$, $Q_2 := \rho(\tilde p, \tilde d) - \rho(\tilde d, \tilde r)$, $Q_3 := \rho(\tilde q, \tilde d) - \rho(\tilde d, \tilde r)$, and $Q_4 := \rho(\tilde q, \tilde n_1) -
\rho(\tilde n_1, \tilde r)$. Since the line passing through the boundary of the Voronoi regions of any two points
in an optimal set of $n$-means, $n \ge 2$, is the perpendicular bisector of the line segment joining
the two points, we must have $Q_1 = 0$, $Q_2 = 0$, $Q_3 = 0$ and $Q_4 = 0$. Using Mathematica, we
solve these four equations for the parameters $\alpha$, $\beta$, $\gamma$ and $\delta$ up to 20 decimal places and obtain

\[ \alpha = 0.49729450782679201845, \quad \beta = 0.57487645285849021867, \]
\[ \gamma = 0.34568004381771961464, \quad \delta = 0.38346841237225538981. \]

Now, using the above values of $\alpha$, $\beta$, $\gamma$, and $\delta$ we obtain the position vectors $\tilde p$, $\tilde q$, $\tilde r$ and $\tilde s$ as
follows:

\[ \tilde p = (\tfrac{1}{2}, 0.5436907490155839431), \]
\[ \tilde q = (\tfrac{1}{2}, 0.1926448341274137497), \]
\[ \tilde r = (0.2302330149367283460, 0.1649562245075873150), \]
\[ \tilde s = (0.769766985063271654, 0.1649562245075873150). \]

Hence, these four points form an optimal set of four-means. Notice
that due to symmetry there are three optimal sets of four-means. As before, we can also calculate
the quantization error in this case.
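The four-means configuration can be cross-checked from the quoted parameter values alone. The
following Python sketch rebuilds the four Voronoi regions, recomputes their centroids, evaluates
the bisector residuals $Q_1, \dots, Q_4$, and prints the resulting quantization error. It is a hedged
numerical check in double precision, not the Mathematica computation used in the paper; the
helper names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
AREA = SQRT3 / 4  # area of the unit equilateral triangle

def fan(poly):
    """Split a convex polygon into triangles sharing its first vertex."""
    return [(poly[0], poly[i], poly[i + 1]) for i in range(1, len(poly) - 1)]

def tri_area(u, v, w):
    return abs((v[0] - u[0]) * (w[1] - u[1]) - (w[0] - u[0]) * (v[1] - u[1])) / 2

def tri_centroid(u, v, w):
    return ((u[0] + v[0] + w[0]) / 3, (u[1] + v[1] + w[1]) / 3)

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def centroid(poly):
    """Area-weighted centroid of a convex polygon."""
    parts = [(tri_area(*t), tri_centroid(*t)) for t in fan(poly)]
    total = sum(a for a, _ in parts)
    return (sum(a * g[0] for a, g in parts) / total,
            sum(a * g[1] for a, g in parts) / total)

def sq_dist_integral(poly, c):
    """Integral of ||x - c||^2 over a convex polygon, triangle by triangle."""
    total = 0.0
    for u, v, w in fan(poly):
        sides = dist2(u, v) + dist2(v, w) + dist2(w, u)
        total += tri_area(u, v, w) * (sides / 36 + dist2(tri_centroid(u, v, w), c))
    return total

# Parameter values quoted in the text (truncated to double precision here).
alpha = 0.49729450782679201845
beta = 0.57487645285849021867
gamma = 0.34568004381771961464
delta = 0.38346841237225538981

O, A, B = (0.0, 0.0), (1.0, 0.0), (0.5, SQRT3 / 2)
c = ((1 - alpha) * B[0], (1 - alpha) * B[1])
g = (alpha + (1 - alpha) * B[0], (1 - alpha) * B[1])
d = (gamma, SQRT3 / 2 * (1 - beta))
f = (1 - gamma, SQRT3 / 2 * (1 - beta))
n1, n2 = (delta, 0.0), (1 - delta, 0.0)

regions = {"P": [B, c, d, f, g], "Q": [d, n1, n2, f],
           "R": [O, n1, d, c], "S": [A, g, f, n2]}
means = {k: centroid(v) for k, v in regions.items()}
p, q, r = means["P"], means["Q"], means["R"]

# Perpendicular-bisector residuals Q1..Q4; all should be close to zero.
residuals = [dist2(p, c) - dist2(c, r), dist2(p, d) - dist2(d, r),
             dist2(q, d) - dist2(d, r), dist2(q, n1) - dist2(n1, r)]
error = sum(sq_dist_integral(regions[k], means[k]) for k in regions) / AREA
print(means)
print(residuals)
print("four-means quantization error:", error)
```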
Figure 4. Results of a numerical search algorithm for $1 \le n \le 21$ points, rotated
so that the symmetry axis is vertical.

6. Optimal sets of n-means

As the number of points increases, so does the number of algebraic equations to be solved. We
apply a numerical search algorithm that makes random shifts to the point locations, accepting
better configurations, and gradually decreasing the shift amplitude in the absence of improvement.
In Figure 4 we present the results of this numerical search for $n \le 21$ points. Based on
these results we make the following conjectures ("most" means a set with density greater than
$1/2$):

Conjecture 6.1. For most $n$, there is an optimal configuration with at least one line of symmetry.

In Figure 4 this line of symmetry is chosen to be vertical. In each case the number of points
on each side of the vertical line is equal; however, for $n = 8$ and $n = 19$, the locations of the points
do not appear to be quite symmetrical.
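The numerical search described at the start of this section (random shifts, acceptance of better
configurations, shrinking shift amplitude when no improvement is found) can be sketched in Python
as follows. This is only an illustration under stated assumptions, not the authors' code: the
error is approximated on a tiling of the triangle into $m^2$ small congruent triangles, and the
acceptance rule, the shrink factor `0.7` and the patience of `20` failed trials are arbitrary choices.

```python
import math
import random

SQRT3 = math.sqrt(3.0)

def cell_centroids(m):
    """Centroids of the m*m congruent small triangles tiling the unit triangle."""
    cells = []
    for j in range(m):
        for i in range(m - j):
            # upward-pointing small triangle with lower-left lattice corner (i, j)
            cells.append(((i + j / 2 + 0.5) / m, (j + 1 / 3) * SQRT3 / 2 / m))
            if i + j <= m - 2:
                # downward-pointing small triangle adjacent to it
                cells.append(((i + j / 2 + 1.0) / m, (j + 2 / 3) * SQRT3 / 2 / m))
    return cells

def distortion(points, cells, m):
    """Approximate quantization error: nearest-point squared distance averaged
    over cell centroids, plus the within-cell term (1/m)^2 / 12."""
    total = sum(min((x - a) ** 2 + (y - b) ** 2 for a, b in points)
                for x, y in cells)
    return total / len(cells) + (1.0 / m) ** 2 / 12

def random_search(n, m=12, steps=500, seed=0):
    """Random-shift search; returns (points, best_error, initial_error)."""
    rng = random.Random(seed)
    cells = cell_centroids(m)
    points = rng.sample(cells, n)
    initial = best = distortion(points, cells, m)
    amplitude, fails = 0.2, 0
    for _ in range(steps):
        k = rng.randrange(n)
        trial = list(points)
        trial[k] = (points[k][0] + rng.uniform(-amplitude, amplitude),
                    points[k][1] + rng.uniform(-amplitude, amplitude))
        err = distortion(trial, cells, m)
        if err < best:                 # accept better configurations only
            points, best, fails = trial, err, 0
        else:
            fails += 1
            if fails >= 20:            # no recent improvement: shrink the shifts
                amplitude *= 0.7
                fails = 0
    return points, best, initial

pts, best, initial = random_search(3)
print(initial, "->", best)
print(pts)
```

With finer tilings (larger `m`) and more steps, the output for small `n` can be compared with
the exact configurations of the previous sections; the sketch makes no claim about reaching a
global optimum.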

We also note that when $n$ is a triangular number, the points lie very close to a triangular
lattice, and for other values, are located in identifiable rows, and are close to the union of two
subsets of triangular lattices. Specifically:

Conjecture 6.2. For most $n$, there is an optimal configuration with $N = \lfloor\sqrt{2n}\rfloor$ rows. The $j$th
row has $j$ points for $j \le J$ where $J = N - |n - N(N + 1)/2|$. If $n > N(N + 1)/2$ the rows with
$j > J$ each have one extra point (so, the $j$th row has $j + 1$ points), while if $n < N(N + 1)/2$ they
each have one fewer point (so, the $j$th row has $j - 1$ points).

Notice that $\lfloor\sqrt{2n}\rfloor$ identifies the closest triangular number to a natural number $n$. The
conjecture is not stated for all $n$ as possible exceptions are $n = 12$ (wrong number of rows) and
$n = 14$ (wrong distribution of points in rows).
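The row counts predicted by Conjecture 6.2 are easy to tabulate. The Python sketch below is an
illustration only; it assumes the reading "the $j$th row has $j$ points for $j \le J$" of the
conjecture, and the function name `conjectured_rows` is hypothetical.

```python
import math

def conjectured_rows(n):
    """Number of points in each of the N rows predicted by Conjecture 6.2."""
    N = math.isqrt(2 * n)              # N = floor(sqrt(2n))
    T = N * (N + 1) // 2               # triangular number with N rows
    J = N - abs(n - T)
    rows = []
    for j in range(1, N + 1):
        if j <= J:
            rows.append(j)             # unchanged rows of the triangular lattice
        elif n > T:
            rows.append(j + 1)         # one extra point per remaining row
        else:
            rows.append(j - 1)         # one fewer point per remaining row
    return rows

for n in range(1, 22):
    print(n, conjectured_rows(n))
```

For instance, `conjectured_rows(4)` gives one point in the first row and three in the second,
which matches the arrangement of $P$ above $R$, $Q$, $S$ found for four-means. By construction the
row counts always sum to $n$.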

When $n$ is a triangular number $N(N + 1)/2$, the locations are close to a triangular lattice,
and it is possible to obtain a good bound on the quantization error:

Theorem 6.3. When $n = N(N + 1)/2$ for some positive integer $N \ge 3$, the quantization error
is controlled by the bound

\[ V_n \le \frac{45N^3 - 28\sqrt{21}N^2 + (301 - 28\sqrt{21})N - 98}{324N^3(N - 1)^2} = \frac{5}{36N^2} - \frac{14\sqrt{21} - 45}{162N^3} + O(N^{-4}). \]


Proof. The proof is by direct calculation for the specific configuration shown in Figure 5. The
points lie on a triangular lattice aligned with the triangular domain and have Voronoi regions
as shown. There are two parameters, the lattice spacing $d$, and the distance $a$ from any of the
edge or corner points to the edge of the triangle. We set $L$ to be the side length of the large
triangle (set equal to unity at the end), so that the area is $\mathrm{Area} = L^2\sqrt{3}/4$. We then have

\[ L = (N - 1)d + 2\sqrt{3}\,a. \]

It is convenient to make $d$ the subject of this equation and substitute into the expressions
below. Placing a point at the origin, we can find the quantization error due to right triangular
or rectangular domains:

\[ \begin{aligned}
V_{\pi/6}(r) &= \int_0^r dx \int_0^{x/\sqrt{3}} dy\, \frac{x^2 + y^2}{\mathrm{Area}} = \frac{10r^4}{27L^2}, \\
V_{\pi/3}(r) &= \int_0^r dx \int_0^{\sqrt{3}x} dy\, \frac{x^2 + y^2}{\mathrm{Area}} = \frac{2r^4}{L^2}, \\
V_{\mathrm{rect}}(l, w) &= \int_0^l dx \int_0^w dy\, \frac{x^2 + y^2}{\mathrm{Area}} = \frac{4lw(l^2 + w^2)}{3\sqrt{3}L^2}.
\end{aligned} \]

Figure 5. The construction in the proof of Theorem 6.3, illustrating the split of
the Voronoi regions of centre, edge and corner points into triangles and rectangles.
The value of $a$ chosen is the optimal value $a_{\mathrm{opt}}$ defined below.

Then, each point has a combination of these contributions: $V_{\mathrm{center}} = 12V_{\pi/6}(d/2)$, $V_{\mathrm{edge}} =
6V_{\pi/6}(d/2) + 2V_{\mathrm{rect}}(d/2, a)$, $V_{\mathrm{corner}} = 2V_{\pi/6}(d/2) + 2V_{\mathrm{rect}}(d/2, a) + 2V_{\pi/3}(a)$, and the overall
quantization error (giving a bound for the optimal quantization error) is a sum of these, counting
the number of points of each type:

\[ \begin{aligned}
V_n &\le \frac{(N - 3)(N - 2)}{2}V_{\mathrm{center}} + 3(N - 2)V_{\mathrm{edge}} + 3V_{\mathrm{corner}} \\
&= \frac{144\sqrt{3}a^4N(N - 2) + 144a^3N(N - 2)L + 144\sqrt{3}a^2L^2 - 84aL^3 + 5\sqrt{3}L^4}{36\sqrt{3}\,L^2(N - 1)^2}.
\end{aligned} \tag{6} \]

Expanding for large $N$ and $L$, keeping both quantities at the same order, gives to leading order
the optimal

\[ a_{\mathrm{opt}} = \frac{\sqrt{7}L}{6N}, \]

which, substituted into the expression (6), gives the stated result. □
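The bound of Theorem 6.3 can be reproduced numerically from the three building blocks
$V_{\pi/6}$, $V_{\pi/3}$ and $V_{\mathrm{rect}}$. The Python sketch below sums the contributions of centre, edge and
corner points for $L = 1$ and $a = \sqrt{7}/(6N)$ and compares the total with the closed form in the
theorem. The function names are illustrative and not part of the paper.

```python
import math

SQRT3 = math.sqrt(3.0)

def lattice_bound(N, a, L=1.0):
    """Sum of Voronoi-cell contributions for the lattice configuration
    with N rows, edge offset a and side length L."""
    d = (L - 2 * SQRT3 * a) / (N - 1)      # lattice spacing from L = (N-1)d + 2*sqrt(3)*a

    def v_pi6(r):
        return 10 * r ** 4 / (27 * L ** 2)

    def v_pi3(r):
        return 2 * r ** 4 / L ** 2

    def v_rect(l, w):
        return 4 * l * w * (l * l + w * w) / (3 * SQRT3 * L ** 2)

    v_center = 12 * v_pi6(d / 2)
    v_edge = 6 * v_pi6(d / 2) + 2 * v_rect(d / 2, a)
    v_corner = 2 * v_pi6(d / 2) + 2 * v_rect(d / 2, a) + 2 * v_pi3(a)
    return ((N - 3) * (N - 2) / 2 * v_center
            + 3 * (N - 2) * v_edge + 3 * v_corner)

def closed_form(N):
    """Right-hand side of the bound in Theorem 6.3 (L = 1)."""
    s21 = math.sqrt(21.0)
    return ((45 * N ** 3 - 28 * s21 * N ** 2 + (301 - 28 * s21) * N - 98)
            / (324 * N ** 3 * (N - 1) ** 2))

for N in (3, 6, 20):
    a_opt = math.sqrt(7.0) / (6 * N)
    # columns: N, summed contributions, closed form, leading term 5/(36 N^2)
    print(N, lattice_bound(N, a_opt), closed_form(N), 5 / (36 * N * N))
```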

In the general case (arbitrary $n$) we have an asymptotic result:

Corollary 6.4. The quantization error satisfies

\[ V_n \le \frac{5}{72n} + O(n^{-3/2}) \]

as $n \to \infty$.

Proof. This follows from Theorem 6.3. For arbitrary $n$, the distance to the previous triangular
number is of order $\sqrt{n}$. Thus we can add the extra points without increasing the leading term of
the quantization error. □

We expect that the triangular lattice is optimal to leading order, so that $\le$ may be replaced
by $\sim$. Furthermore, by placing a triangular lattice within a more general domain, we expect:

Conjecture 6.5. If we consider a measure $P$ uniform on a domain with finite area $A$ and finite
perimeter, then as $n \to \infty$,

\[ V_n \sim \frac{5\sqrt{3}A}{54n}. \]